Video surveillance (CCTV) is a technology that is now deeply woven into everyday life: people have come to expect it in a wide variety of circumstances (Ossola, 2019). The rationale behind the installation of these systems seems very clear to governments. For example, on Buffalo’s (NY) open data website, one can read that “the City of Buffalo deploys a real-time, citywide video surveillance system to augment the public safety efforts of the Buffalo Police Department”. Yet the development of this technology is not free from controversy. For instance, many observers claim that the expansion of video surveillance poses an unregulated threat to privacy (ACLU, 2021). Still, many people seem willing to accept this loss of privacy, as the surge in video surveillance makes them feel safer (Madden & Rainie, 2015).
Throughout this research, we challenge the widespread belief that people who have “nothing to hide” should be content with the expansion of CCTV networks because it makes them safer (Madden & Rainie, 2015). Indeed, beyond the many privacy issues linked to this surge in video surveillance systems, one might legitimately ask whether these cameras actually make people safer.
The goal of the first phase of this project is to investigate the crime-deterrent potential of CCTVs in an American city. This potential will also be compared across the different types of crime committed in this area. In a second phase, the distribution of CCTVs within the city will be investigated. Indeed, according to some research, mass surveillance has a stronger impact on communities already disadvantaged by their poverty, race, religion, ethnicity, or immigration status (Gellman & Adler-Bell, 2017). We would like to see whether our data enables us to validate or invalidate this theory. It would also be extremely interesting, even though challenging, to see whether the installation of surveillance systems could create even more pernicious issues such as crime displacement (Waples, Gill & Fisher, 2009).
In sum we argue that, in a world where CCTVs and other surveillance systems are flourishing, it might be beneficial to take a step back and question both the efficacy and the implementation design of such technologies, since they are often portrayed by different stakeholders as miraculous solutions to very complex issues.
Augustin: Augustin obtained a degree in Business Administration at the University of St-Gallen where he had the opportunity to develop a strong interest in digital business ethics. He wrote his bachelor’s thesis on the privacy implications of the use of fear appeals in home surveillance devices’ marketing strategy.
Marine: Marine completed a bachelor’s degree in Law at the UBO (Université de Bretagne-Occidentale). She is currently enrolled in the Master DCS (Droit, Criminalité et Sécurité des technologies de l’information) at the University of Lausanne. Last year, she had the opportunity to take a data protection course and learn more about cyber security and crime in general.
Daniel: Daniel is an exchange student from Koblenz, Germany. He obtained a bachelor’s degree in Business Administration/Management at the WHU - Otto Beisheim School of Management, Germany. He is currently pursuing a Master of Management, focusing on family businesses, entrepreneurship and data science in his courses. Of particular relevance to this project, Daniel spent several months in the United States after high school and can thus relate to the topic of police violence and crime in the US.
Firstly, from our respective backgrounds, we derive a strong interest in new technologies and privacy. We believe that every person is entitled to the fundamental right to privacy. Unfortunately, one observes an increasing tendency of governments and other stakeholders (e.g. businesses such as GAFA (Google, Amazon, Facebook, Apple)) to take more and more control in our daily lives through digital technologies such as cameras, computers or smartphones. For these reasons it is interesting to ask ourselves if this massive collection of our data leads to more security or more restrictions of our freedom.
Secondly, under European law such as the GDPR, the collection and processing of personal data must be proportionate to the purpose of that processing. We are therefore interested in determining whether the same principle applies in the United States, and whether installing cameras for security purposes really reduces crime and makes a city more secure.
Thirdly, crime and the legislative debate on the right to carry a gun in the United States are fascinating. At first sight, it seems as if the freedom to carry a gun makes the US more prone to crimes such as mass shootings. To verify or falsify our hypotheses, we also want to use the datasets we obtained to see what kind of crime prevails in American cities and how it varies across districts and their particularities.
We have four raw data sets. All data sets were retrieved from the Baltimore government open data portal. We found data about crimes committed in Baltimore, CCTV locations in the city and poverty rates. We also found a data set showing the reference boundaries of the Community Statistical Area geographies. The latter will certainly be helpful to match the observations of the data sets with one another.
This dataset represents the location and characteristics of major crimes against persons, such as homicide, shooting, robbery, aggravated assault, etc., within the City of Baltimore. This dataset contains 350’294 observations.
RowID = ID of the row, 350’294 in total
CrimeDateTime = date and time of the crime. Format yyyy/mm/dd hh:mm:sstzd
CrimeCode = Code corresponding to the type of crime committed
Location = Textual information on where the crime was committed
Description = Textual description of the crime committed corresponding to a CrimeCode.
Inside/Outside = Provides information on whether crime was committed inside or outside
Weapon = Provides details on what weapon has been used, if any
Post = Number corresponding to the Police Post concerned. A map with corresponding police posts can be found here: http://moit.baltimorecity.gov/sites/default/files/police_districts_w_posts.pdf?__cf_chl_captcha_tk__=pmd_NhnE710SS8QEWdKOyT5Ug6IJZGoF6iIntFYY30vctes-1634309136-0-gqNtZGzNAxCjcnBszQPl
District = Name of the district, regrouping different neighbourhoods. Baltimore is officially divided into nine geographical regions: North, Northeast, East, Southeast, South, Southwest, West, Northwest, and Central.
Neighborhood = Name of the neighborhood in which the crime was committed. Most names match the neighborhood names contained in the dataset about Community Statistical Areas.
Latitude = Latitude, Coordinate system: EPSG:4326 WGS 84
Longitude = Longitude, Coordinate system: EPSG:4326 WGS 84
GeoLocation = Combination of latitude and longitude, Coordinate system: EPSG:4326 WGS 84
Premise = Information on the premise where the crime was committed. One counts more than 120’000 observations in the streets.
Source of the data set: [https://data.baltimorecity.gov/datasets/part1-crime-data/explore]
This dataset represents closed circuit camera locations capturing activity within 256ft (~2 blocks). It contains 837 observations in total.
X = Longitude: Coordinate system: EPSG:3857 WGS 84 / Pseudo-Mercator
Y = Latitude: Coordinate system: EPSG:3857 WGS 84 / Pseudo-Mercator
OBJECTID = ID of the camera, 837 in total
CAM_NUM = Unique number attributed to the camera. This might suggest that the dataset does not show the location of every camera in Baltimore. Note that the CAM_NUM column contains many zeros, which we could not relate to anything; we are still working out their exact meaning.
LOCATION = Textual information on where the camera is located
PROJ = Name of the area in which the camera is located. It does not always match the name of the “standard” community statistical areas.
XCCORD = Longitude, Coordinate system: EPSG:4326 WGS 84
YCOORD = Latitude, Coordinate system: EPSG:4326 WGS 84
Source of the data set: [https://data.baltimorecity.gov/datasets/cctv-locations-crime-cameras/explore]
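The CCTV dataset stores each position twice, once in EPSG:3857 (X, Y) and once in EPSG:4326 (XCCORD, YCOORD). As a minimal sketch of how the two relate, the function below applies the spherical Mercator inverse in base R. The function name and the sample coordinates are illustrative assumptions, not values taken from the dataset.

```r
# Convert EPSG:3857 (Web Mercator) metres to EPSG:4326 degrees.
# Uses the spherical Mercator inverse with the WGS 84 semi-major axis.
mercator_to_lonlat <- function(x, y, radius = 6378137) {
  lon <- (x / radius) * 180 / pi
  lat <- (2 * atan(exp(y / radius)) - pi / 2) * 180 / pi
  data.frame(lon = lon, lat = lat)
}

# Hypothetical camera position (made-up values, roughly in the Baltimore area)
res <- mercator_to_lonlat(x = -8529000, y = 4762000)
res
```

Comparing such converted values with XCCORD and YCOORD would be one way to check that the two coordinate pairs of a camera describe the same point.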
This dataset provides information about the percent of family households living below the poverty line. This indicator measures the percentage of households whose income fell below the poverty threshold out of all households in an area.
Federal and state governments use such estimates to allocate funds to local communities. Local communities use these estimates to identify the number of individuals or families eligible for various programs. This information will be useful for studying the distribution of CCTVs within Baltimore in relation to the poverty level of a given area. This dataset contains 55 observations, one percentage for each community statistical area. There seems to be only one NA. The most relevant variables are the following:
CSA2010 = name of the community statistical area. The Baltimore Data Collaborative and the Baltimore City Department of Planning divided Baltimore into 55 CSAs. These 55 units combine Census Bureau geographies together in ways that match Baltimore’s understanding of community boundaries, and are used in social planning.
hhpov15 - hhpov19 = each of these five columns contains the percentage of family households living below the poverty line for a given year, from 2015 to 2019.
Shape_Area - Shape_Length = standard fields to determine the area and the perimeter of a polygon
Source of the data set: [https://data.baltimorecity.gov/datasets/bniajfi::percent-of-family-households-living-below-the-poverty-line-community-statistical-area/explore]
This dataset provides information about the Community Statistical Area geographies for Baltimore City, based on aggregations of Census tract (2010) geographies. It will serve as a geographical point of reference for us to match the observations of the datasets with one another. This dataset contains 55 observations, one for each area. The most relevant variables are the following:
community = name of the community statistical area. The Baltimore Data Collaborative and the Baltimore City Department of Planning divided Baltimore into 55 CSAs. These 55 units combine Census Bureau geographies together in ways that match Baltimore’s understanding of community boundaries, and are used in social planning.
neigh = name of the neighbourhoods contained in the area.
tracts = census tract associated with each neighbourhood. An interactive map of neighborhood statistical areas with census tracts is available online (http://planning.baltimorecity.gov/sites/default/files/Neighborhood%20Statistical%20Areas%20with%20Census%20Tracts.pdf?__cf_chl_captcha_tk__=pm d_5qD.WnCEfWnEa5h1muEPfTVDhN2uheRFagwmglbtKxg-1634299783-0-gqNtZGzNAzujcnBszQO9).
Source of the data set: [https://data.baltimorecity.gov/datasets/community-statistical-area-1/explore?location=39.284605%2C-76.620550%2C12.26]
area_data <- read_csv(file = here::here("data/Community_Statistical_Areas__CSAs___Reference_Boundaries.csv"))
Here the main goal is to transform the area data into a new dataset that contains one observation per neighbourhood. We achieve this by first creating a new dataset in which each neighbourhood is assigned to an area, and then adding a new column with lower-case names for the later merge.
area_data2 <- separate_rows(area_data, Neigh, sep = ", ") #Creation of a new dataset with each neighbourhood being assigned to an area
area_data2 <- mutate(area_data2,neigh=tolower(Neigh)) #Creation of new column with lower case letters
Since the neighborhood names in the crime dataset are written in upper-case letters, we also create a lower-case column there in order to join the two datasets. Next, we use the anti_join function to find the observations that did not match. The unmatched neighborhoods, listed using the names function, are:
#> [1] "mount washington"
#> [2] "carroll - camden industrial area"
#> [3] "patterson park neighborhood"
#> [4] "glenham-belhar"
#> [5] "new southwest/mount clare"
#> [6] ""
#> [7] "mount winans"
#> [8] "rosemont homeowners/tenants"
#> [9] "broening manor"
#> [10] "boyd-booth"
#> [11] "lower herring run park"
#> [12] "mt pleasant park"
We get rid of the 764 remaining observations, which had no information about their neighbourhood. Finally, we use the semi_join function to create the final dataset, which is essentially the original dataset minus these 764 observations.
To see the structure of the dataset we use the str function and filter for dates from 2000 onwards (since the Baltimore CCTV program started in the year 2000).
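The matching steps described above can be sketched on toy data. The two small tables and their values below are made up for illustration; only the functions (separate_rows, tolower, anti_join, semi_join) are those named in the text.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the area data: one row per area, neighbourhoods comma-separated
areas <- tibble(
  Community = c("Area A", "Area B"),
  Neigh     = c("Mount Vernon, Mid-Town Belvedere", "Butcher's Hill")
)

# Toy stand-in for the crime data: upper-case names, one unknown, one empty
crimes <- tibble(
  RowID        = 1:4,
  Neighborhood = c("MOUNT VERNON", "BUTCHER'S HILL", "MOUNT WINANS", "")
)

# One row per neighbourhood, plus a lower-case join key
areas_long <- areas %>%
  separate_rows(Neigh, sep = ", ") %>%
  mutate(neigh = tolower(Neigh))

crimes <- mutate(crimes, neigh = tolower(Neighborhood))

# Rows of `crimes` with no matching neighbourhood in `areas_long`
unmatched <- anti_join(crimes, areas_long, by = "neigh")
unique(unmatched$neigh)

# Rows of `crimes` that do have a match (only the columns of `crimes` are kept)
matched <- semi_join(crimes, areas_long, by = "neigh")
nrow(matched) + nrow(unmatched) == nrow(crimes)
```

Because anti_join and semi_join partition the rows of the first table, the two row counts always add up to the original number of rows, which gives a simple check that no observation was lost or duplicated.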
#> 'data.frame': 349530 obs. of 25 variables:
#> $ X : num 1421661 1428630 1429982 1433589 1421304 ...
#> $ Y : num 593584 592267 593694 590797 591033 ...
#> $ RowID : int 1 2 3 4 5 6 7 8 9 10 ...
#> $ CrimeDateTime : chr "2021/09/24 08:00:00+00" "2021/09/23 02:0"..
#> $ CrimeCode : chr "6D" "6D" "6J" "6J" ...
#> $ Location : chr "500 SAINT PAUL ST APT 118" "0 N WASHINGT"..
#> $ Description : chr "LARCENY FROM AUTO" "LARCENY FROM AUTO" ""..
#> $ Inside_Outside : chr "" "" "" "" ...
#> $ Weapon : chr NA NA NA NA ...
#> $ Post : chr "124" "212" "221" "225" ...
#> $ District : chr "CENTRAL" "SOUTHEAST" "SOUTHEAST" "SOUTHE"..
#> $ Neighborhood : chr "MOUNT VERNON" "BUTCHER'S HILL" "MCELDERR"..
#> $ Latitude : num 39.3 39.3 39.3 39.3 39.3 ...
#> $ Longitude : num -76.6 -76.6 -76.6 -76.6 -76.6 ...
#> $ GeoLocation : chr "(39.2959,-76.6137)" "(39.2922,-76.5891)""..
#> $ Premise : chr "" "" "" "" ...
#> $ VRIName : chr "" "" "" "" ...
#> $ Total_Incidents: int 1 1 1 1 1 1 1 1 1 1 ...
#> $ Shape : logi NA NA NA NA NA NA ...
#> $ neigh : chr "mount vernon" "butcher's hill" "mcelderr"..
#> $ FID : num 55 16 31 26 14 32 13 26 28 41 ...
#> $ Community : chr "Midtown" "Fells Point" "Madison/East End"..
#> $ Neigh : chr "Mount Vernon" "Butcher's Hill" "McElderr"..
#> $ Tracts : chr "110100, 110200, 140100, 120500" "020200,"..
#> $ Link : chr "http://bniajfi.org/community/Midtown/" ""..
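The str output shows that CrimeDateTime is stored as text, so it has to be parsed before the filter on dates from 2000 can be applied. The following base R sketch shows one way to do this on made-up timestamps. It assumes that the trailing "+00" denotes a UTC offset, which the source does not state explicitly.

```r
# Toy timestamps in the same textual form as CrimeDateTime (made-up values)
crime_dt <- c("2021/09/24 08:00:00+00",
              "1999/12/31 23:00:00+00",
              "2000/01/01 00:00:00+00")

# Drop the "+00" suffix (assumed to be a UTC offset), then parse as UTC
parsed <- as.POSIXct(sub("\\+00$", "", crime_dt),
                     format = "%Y/%m/%d %H:%M:%S", tz = "UTC")

# Keep observations from 1 January 2000 onwards
cutoff <- as.POSIXct("2000-01-01 00:00:00", tz = "UTC")
keep <- !is.na(parsed) & parsed >= cutoff
crime_dt[keep]
```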
In Baltimore, the standard community statistical areas comprise 56 areas. However, one of these 56 areas is the jail. The poverty data contains only 55 statistical areas, since there is no poverty data for the jail. To resolve this discrepancy, we add a new row. Moreover, we needed to fill a missing value for Baltimore in the year 2019; here we took the average of the previous years.
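As a minimal sketch of these two fixes, the base R code below fills a missing 2019 value with the mean of the earlier years and appends a row for the jail. The table and its values are made up, and leaving the jail's poverty values as NA is an assumption, since the text does not say which values were entered.

```r
# Toy poverty table: one row per area, one column per year (made-up values)
poverty <- data.frame(
  CSA2010 = c("Area A", "Area B"),
  hhpov15 = c(10, 20), hhpov16 = c(12, 22), hhpov17 = c(11, 21),
  hhpov18 = c(13, 23), hhpov19 = c(14, NA)
)

# Replace a missing 2019 value by the mean of that area's 2015-2018 values
past_years <- c("hhpov15", "hhpov16", "hhpov17", "hhpov18")
missing_19 <- is.na(poverty$hhpov19)
poverty$hhpov19[missing_19] <- rowMeans(poverty[missing_19, past_years])

# Append a placeholder row for the jail so that both tables list the same areas
# (NA values are an assumption made for this sketch)
jail <- data.frame(CSA2010 = "Unassigned -- Jail",
                   hhpov15 = NA, hhpov16 = NA, hhpov17 = NA,
                   hhpov18 = NA, hhpov19 = NA)
poverty <- rbind(poverty, jail)
poverty
```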
Here we need to make sure to not have any missing values in the CCTV dataset.
which(is.na(cctv_data$X))
which(is.na(cctv_data$Y))
filter(cctv_data, X=="")
filter(cctv_data, Y=="")
#We are not sure this is the proper technique, but it confirms that we have neither NAs nor empty values, so that our dataset is tidy
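A more compact variant of this check, covering all columns at once, could look as follows in base R. The toy table and its column values are invented for illustration.

```r
# Toy camera table with one missing and one empty value (made-up data)
cctv_toy <- data.frame(
  X = c(-8529000, NA, -8530500),
  LOCATION = c("Main St & 1st Ave", "", "2nd Ave & Oak St"),
  stringsAsFactors = FALSE
)

# Number of NAs per column
colSums(is.na(cctv_toy))

# Number of empty strings per column (NA comparisons are ignored)
colSums(cctv_toy == "", na.rm = TRUE)

# Rows that are complete and contain no empty strings
ok <- complete.cases(cctv_toy) & rowSums(cctv_toy == "", na.rm = TRUE) == 0
cctv_toy[ok, ]
```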
The original CCTV dataset posed a slight challenge: although it contained neighborhood names, these did not match the standard neighborhood names. To solve this, we resorted to geospatial counting.
Our procedure included the following steps. After reading the table and converting the data into a data table, we define which columns hold the coordinates. Several types of coordinates are available, and we use X and Y. Spatial objects carry a special attribute called the CRS, i.e. the coordinate reference system in use. We define an object crs.geo1 as the coordinate system to be used for all our files. Next, we use the proj4string function to assign crs.geo1 to the data.
#read in data table
balt_dat <- fread(file = here::here("data/Baltimore_CCTV_Locations_Crime_Cameras.csv"))
#convert to data table
balt_dat <- as.data.table(balt_dat)
#make data spatial
coordinates(balt_dat) <- c("X","Y")
crs.geo1 <- CRS("+proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs +type=crs")
#> Warning in showSRID(uprojargs, format = "PROJ", multiline =
#> "NO", prefer_proj = prefer_proj): Discarded ellps WGS 84 in Proj4
#> definition: +proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0
#> +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs
#> +type=crs
#> Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO",
#> prefer_proj = prefer_proj): Discarded datum World Geodetic System
#> 1984 in Proj4 definition
proj4string(balt_dat) <- crs.geo1
Then we plot to see the output (a cloud of points representing all the CCTVs).
Next, we have to work with the shapefile, which is another special file. It is a set of polygons representing the different areas of the city of Baltimore. We downloaded this file from the Open Baltimore portal, read it in and again assigned it our crs.geo1 coordinate system. In this way we ensure that our files share the same coordinate system.
#read in shapefile of baltimore
baltimore <- readOGR(dsn = here::here("data/Community_Statistical_Area"), layer = "Community_Statistical_Area") #name of file and object
#> Warning in OGRSpatialRef(dsn, layer, morphFromESRI = morphFromESRI,
#> dumpSRS = dumpSRS, : Discarded ellps WGS 84 in Proj4 definition:
#> +proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0
#> +k=1 +units=m +nadgrids=@null +wktext +no_defs
#> Warning in OGRSpatialRef(dsn, layer, morphFromESRI = morphFromESRI,
#> dumpSRS = dumpSRS, : Discarded datum WGS_1984 in Proj4 definition:
#> +proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0
#> +k=1 +units=m +nadgrids=@null +wktext +no_defs
#> Warning in showSRID(wkt2, "PROJ"): Discarded ellps WGS 84 in Proj4
#> definition: +proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0
#> +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs
#> +type=crs
#> Warning in showSRID(wkt2, "PROJ"): Discarded datum World Geodetic
#> System 1984 in Proj4 definition
#> OGR data source with driver: ESRI Shapefile
#> Source: "/Users/Augustin/Documents/GitHub/DSBA-Group10/dsfba_project/data/Community_Statistical_Area", layer: "Community_Statistical_Area"
#> with 56 features
#> It has 12 fields
proj4string(baltimore) <- crs.geo1
#> Warning in proj4string(obj): CRS object has comment, which is lost in
#> output
#> Warning in `proj4string<-`(`*tmp*`, value = new("CRS", projargs = "+proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs")): A new CRS was assigned to an object with an existing CRS:
#> +proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs +type=crs
#> without reprojecting.
#> For reprojection, use function spTransform
Again we plot the results.
#plot
plot(baltimore,main="Spread of CCTVs in different communities of Baltimore")
plot(balt_dat,pch=20, col="steelblue" , add=TRUE) #If we plot these two lines together, what we obtain is a map of Baltimore, with the 56 community statistical areas and the CCTVs on top of the map.
To express these results in numbers, we need R to count how many CCTVs belong to each area. Here, the function over determines in which polygon each CCTV lies. Next, we create a new object called counts, turn it into a data frame (so that it is easier to work with) and use sum(counts$Freq) to check that all CCTVs have been counted. The resulting table has 41 rows, so only 41 out of the 56 areas contain at least one CCTV.
#Perform the count
proj4string(balt_dat)
#> Warning in proj4string(balt_dat): CRS object has comment, which is
#> lost in output
#> [1] "+proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs"
proj4string(baltimore) #To be able to perform the count, we must ensure that the two spatial files have a similar CRS. This is the case as we attributed these two files "crs.geo1"
#> Warning in proj4string(baltimore): CRS object has comment, which is
#> lost in output
#> [1] "+proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs"
res2 <- over(balt_dat,baltimore) #This function tells you to which community each CCTV belongs to
counts <- table(res2$community)
counts <- as.data.frame(counts)
colnames(counts)[1] <- "Community"
sum(counts$Freq) #We see that we have 836 observations in total; this is a good sign, as our initial CCTV data set contained 836 observations
#> [1] 836
To make this workable, we need to create a new CCTV table in which we set each NA count to 0. Lastly, we create a new column with the mutate function to compute the CCTV density, i.e. the number of CCTVs in an area divided by the total number of CCTVs.
CCTV_per_area <- area_data[2] %>%
left_join(counts,by="Community") #One must add the communities where there are no counts i.e no CCTV
CCTV_per_area[is.na(CCTV_per_area)] <- 0
CCTV_per_area <- mutate(CCTV_per_area, density_perc=(CCTV_per_area$Freq/(sum(CCTV_per_area$Freq)))*100)
Here we use the %in% operator to check that the communities in the Baltimore dataset are the same as those in the CCTV-per-area dataset. As this returns only TRUE values, the match is complete and we can proceed with the analysis.
#> [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [14] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [27] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [40] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [53] TRUE TRUE TRUE TRUE
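The counting, zero-filling and density steps described above can also be sketched with the sf package. This is a swapped-in alternative to the sp and rgdal functions used in this report, not the method actually applied here. The three square areas and the camera positions are made up, and the helper square is hypothetical.

```r
library(sf)

# Hypothetical helper: a square polygon with its lower-left corner at (x0, y0)
square <- function(x0, y0, size = 1000) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + size, y0), c(x0 + size, y0 + size),
                        c(x0, y0 + size), c(x0, y0))))
}

# Three made-up areas in Web Mercator (EPSG:3857) coordinates
areas <- st_sf(
  community = c("Area A", "Area B", "Area C"),
  geometry  = st_sfc(square(0, 0), square(1000, 0), square(2000, 0), crs = 3857)
)

# Three made-up cameras: two in Area A, one in Area B, none in Area C
cams <- st_as_sf(
  data.frame(X = c(100, 500, 1500), Y = c(100, 500, 500)),
  coords = c("X", "Y"), crs = 3857
)

# Attach to each camera the area it falls in (the role played by over() above)
cams_in_area <- st_join(cams, areas, join = st_within)

# Count cameras per area; the factor levels keep areas with zero cameras
counts <- as.data.frame(table(factor(cams_in_area$community,
                                     levels = areas$community)))
names(counts) <- c("Community", "Freq")

# Share of all cameras located in each area, in percent
counts$density_perc <- counts$Freq / sum(counts$Freq) * 100
counts
```

Using the list of area names as factor levels yields the zero counts directly, so the separate left join and NA replacement are not needed in this variant.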
Next, we perform a left join between the Baltimore dataset and the CCTV-per-area dataset. To account for the different spellings of the key column (capitalised in one dataset, lower case in the other), we pass a named vector to the by argument. Finally, we create the map with the tmap package. The tmap package works like the ggplot2 package: a map always starts with a tm_shape call, to which as many further layers as desired can be added with the plus operator. We used the Baltimore dataset, filled the areas according to the density percentage, defined some breaks, set the borders and finally the layout.
Next we create crime_rate_per_area. To do so, we group and summarize the crime data per community, which enables us to compute the crime rate of each area. Again, we added one more row because we have no values for the prison. We then checked that the rates add up to 100, which gives us confidence to proceed.
#> [1] 100
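The computation just described can be sketched on toy data as follows. The records are made up, the column names follow those used elsewhere in this report, and the row label for the jail is the one used in the felony and misdemeanor tables.

```r
library(dplyr)

# Made-up crime records, one row per incident
crimes <- tibble(Community = c("Area A", "Area A", "Area B", "Area A", "Area B"))

# Number of crimes per area and share of all crimes, in percent
crime_rate <- crimes %>%
  group_by(Community) %>%
  summarise(Crimes = n()) %>%
  mutate(CrimeRatePerArea = Crimes / sum(Crimes) * 100)

# Add an area without any recorded crime (here the jail)
crime_rate <- bind_rows(
  crime_rate,
  tibble(Community = "Unassigned -- Jail", Crimes = 0L, CrimeRatePerArea = 0)
)

# Sanity check: the shares add up to 100
sum(crime_rate$CrimeRatePerArea)
```

Note that the "rate" computed this way is an area's share of all recorded crimes, not a per-capita rate, so larger or busier areas will tend to show higher values.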
Again, we map the crimes in the same way as the CCTVs in the previous section.
library(tmap)
baltimore$community %in% CrimeRatePerArea$Community #We see that we have a perfect match
#> [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [14] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [27] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [40] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [53] TRUE TRUE TRUE TRUE
baltimore@data <- left_join(baltimore@data, CrimeRatePerArea, by = c('community' = 'Community'))
Crime_map <- tm_shape(baltimore) + tm_fill(col = "CrimeRatePerArea", title ="Crime rate per Area in %",style = "quantile") + tm_borders(col="black",alpha=0.3) + tm_layout(inner.margins = 0.05)
Again, we use the tmap package, this time together with the cartogram_ncont function, which distorts the map to show our results. Concretely, we want to show that the crime rate is higher in the city center. This can be shown quite neatly graphically.
The first thing we do here is to compute the unique values of the Description column of the crime data joined with the area dataset. We see that we have 14 types of crime. We want to observe crimes by type and therefore introduce a new classification. The law distinguishes three basic classes of criminal offenses: infractions, misdemeanors, and felonies. Our data set does not seem to contain any infractions.
#> [1] "LARCENY FROM AUTO" "LARCENY"
#> [3] "HOMICIDE" "AUTO THEFT"
#> [5] "COMMON ASSAULT" "AGG. ASSAULT"
#> [7] "BURGLARY" "ROBBERY - COMMERCIAL"
#> [9] "RAPE" "ROBBERY - STREET"
#> [11] "SHOOTING" "ROBBERY - CARJACKING"
#> [13] "ARSON" "ROBBERY - RESIDENCE"
Next we create a dataset called crime_cat, which indicates the crime category to which each recorded crime type belongs. This dataset is used later in a left join with crime_data_per_area. We end up with the crime dataset joined with the area dataset, plus a new column indicating whether the crime was a felony or a misdemeanor.
#> [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [14] TRUE
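A lookup table of this kind, together with the check that every crime type is covered, might look like the sketch below. The category assigned to each description is purely illustrative and is not the classification used in this report; the crime records are made up.

```r
library(dplyr)

# Hypothetical lookup table; the categories shown here are only an
# illustration, not the classification used in the report
crime_cat <- tibble(
  Description = c("HOMICIDE", "LARCENY", "COMMON ASSAULT"),
  Category    = c("Felony", "Misdemeanor", "Misdemeanor")
)

# Made-up crime records
crimes <- tibble(
  RowID       = 1:4,
  Description = c("LARCENY", "HOMICIDE", "LARCENY", "COMMON ASSAULT")
)

# Every description must appear in the lookup table before joining
all(unique(crimes$Description) %in% crime_cat$Description)

# Add the category column; the number of rows must not change
crimes_cat <- left_join(crimes, crime_cat, by = "Description")
nrow(crimes_cat) == nrow(crimes)
```

Keeping the classification in its own table means that a reassignment of a crime type only requires changing one row of crime_cat rather than recoding the crime data.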
Next, we compute Crime_PerCategory_PerArea. Here we use the piping operator, this time grouping by community and category. Again, we check that we indeed have 349482 observations. From this we then compute FelonyStats and MisdemeanorStats, (again) adding the prison row to each dataset.
CrimePerCategoryPerArea <- crime_data_with_areas %>%
group_by(Community,Category) %>%
summarize(RepartitionPerCategoryPerArea=n())
sum(CrimePerCategoryPerArea$RepartitionPerCategoryPerArea) #Again, we check that we indeed have 349482 observations
#> [1] 349482
CrimeCategoryRepartition <- CrimePerCategoryPerArea %>%
group_by(Category) %>%
summarise(Repartition=sum(RepartitionPerCategoryPerArea)) #We observe that in Baltimore, the number of felonies is close to the number of misdemeanors
FelonyStats <- CrimePerCategoryPerArea %>% filter(Category=="Felony") %>%
mutate(FelonyRatePerArea = (RepartitionPerCategoryPerArea/CrimeCategoryRepartition$Repartition[1])*100)
FelonyStats[56,] <- list("Unassigned -- Jail","Felony",0,0)
MisdemeanorStats <- CrimePerCategoryPerArea %>% filter(Category=="Misdemeanor") %>%
mutate(MisdemeanorRatePerArea = (RepartitionPerCategoryPerArea/CrimeCategoryRepartition$Repartition[2])*100)
MisdemeanorStats[56,] <- list("Unassigned -- Jail","Misdemeanor",0,0)
After ensuring that we have a perfect match, we perform a left join for both felonies and misdemeanors and map everything.
#Felony
baltimore$community %in% FelonyStats$Community
#> [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [14] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [27] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [40] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [53] TRUE TRUE TRUE TRUE
baltimore@data <- left_join(baltimore@data, FelonyStats, by = c('community' = 'Community'))
Felony_map <- tm_shape(baltimore) + tm_fill(col = "FelonyRatePerArea", title ="Felony rate per Area in %",style = "quantile") + tm_borders(col="black",alpha=0.3) + tm_layout(inner.margins = 0.05)
#Misdemeanor
baltimore$community %in% MisdemeanorStats$Community
#> [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [14] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [27] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [40] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [53] TRUE TRUE TRUE TRUE
baltimore@data <- left_join(baltimore@data, MisdemeanorStats, by = c('community' = 'Community'))
Misdemeanor_map <- tm_shape(baltimore) + tm_fill(col = "MisdemeanorRatePerArea", title ="Misdemeanor rate per Area in %",style = "quantile") + tm_borders(col="black",alpha=0.3) + tm_layout(inner.margins = 0.05)
The idea is to capture how crime evolved over time. We could have used a loop here; instead, we created one dataset per year. The results are interesting. If we compare the number of observations in each yearly crime dataset, we see roughly 40’000 cases a year, except for 2020 (which is due to COVID) and 2021 (which is not finished). We do not create datasets for 2013 and earlier, because very few observations date from before 2014.
Crime_in_2021 <- crime_data_with_areas %>% filter(CrimeDateTime >= as.Date("2021-01-01") & CrimeDateTime <= as.Date("2021-12-31"))
Crime_in_2020 <- crime_data_with_areas %>% filter(CrimeDateTime >= as.Date("2020-01-01") & CrimeDateTime <= as.Date("2020-12-31"))
Crime_in_2019 <- crime_data_with_areas %>% filter(CrimeDateTime >= as.Date("2019-01-01") & CrimeDateTime <= as.Date("2019-12-31"))
Crime_in_2018 <- crime_data_with_areas %>% filter(CrimeDateTime >= as.Date("2018-01-01") & CrimeDateTime <= as.Date("2018-12-31"))
Crime_in_2017 <- crime_data_with_areas %>% filter(CrimeDateTime >= as.Date("2017-01-01") & CrimeDateTime <= as.Date("2017-12-31"))
Crime_in_2016 <- crime_data_with_areas %>% filter(CrimeDateTime >= as.Date("2016-01-01") & CrimeDateTime <= as.Date("2016-12-31"))
Crime_in_2015 <- crime_data_with_areas %>% filter(CrimeDateTime >= as.Date("2015-01-01") & CrimeDateTime <= as.Date("2015-12-31"))
Crime_in_2014 <- crime_data_with_areas %>% filter(CrimeDateTime >= as.Date("2014-01-01") & CrimeDateTime <= as.Date("2014-12-31"))
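As an alternative to writing one filter per year, the yearly datasets could be produced in a single step with split. The sketch below uses base R on made-up records and assumes that CrimeDateTime has already been converted to a Date column; the object name crime_by_year is hypothetical.

```r
# Made-up crime records with a Date column
crimes <- data.frame(
  RowID = 1:5,
  CrimeDateTime = as.Date(c("2021-03-01", "2020-07-15", "2020-12-31",
                            "2014-01-01", "2013-12-23"))
)

# Keep 2014 onwards, then split into one data frame per calendar year
recent <- crimes[crimes$CrimeDateTime >= as.Date("2014-01-01"), ]
crime_by_year <- split(recent, format(recent$CrimeDateTime, "%Y"))

# Number of observations per year
sapply(crime_by_year, nrow)

# A single year is then available by name
crime_by_year[["2020"]]
```

The result is a named list, so the yearly row counts compared in the text can be obtained for all years at once instead of inspecting eight separate objects.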
crime_data_with_areas %>% filter(CrimeDateTime < as.Date("2014-01-01")) #We see that we have very few (76) observations before 2014, thus we do not consider them
#> X Y RowID CrimeDateTime CrimeCode
#> 1 1400764 588624 350195 2013-12-23 2A
#> 2 1439847 596145 350196 2013-12-12 2A
#> 3 1427923 598383 350197 2013-12-01 2A
#> 4 1426886 589783 350198 2013-12-01 2A
#> 5 1409676 589858 350199 2013-11-02 2A
#> 6 1441807 617244 350200 2013-11-01 2A
#> 7 1412675 597518 350201 2013-09-21 2A
#> 8 1428790 569212 350202 2013-08-01 2A
#> 9 1395691 616617 350203 2013-07-30 2A
#> 10 1432920 570396 350204 2013-07-01 2A
#> 11 1410066 591462 350205 2013-07-01 2A
#> 12 1394633 620474 350206 2013-05-10 2A
#> 13 1415945 607475 350207 2013-02-03 2A
#> 14 1436009 605633 350208 2013-01-01 2A
#> 15 1405178 612679 350209 2013-01-01 2A
#> 16 1425270 570945 350210 2013-01-01 2A
#> 17 1442250 591021 350211 2013-01-01 2A
#> 18 1420453 605016 350212 2012-10-20 2A
#> 19 1400657 586839 350213 2012-10-15 2A
#> 20 1425546 591889 350214 2012-10-01 2A
#> 21 1434795 617173 350215 2012-09-03 2A
#> 22 1424501 571925 350216 2012-07-01 2A
#> 23 1405463 611915 350217 2012-06-19 6G
#> 24 1435056 591823 350218 2012-06-01 2A
#> 25 1409608 592662 350219 2012-06-01 2A
#> 26 1398949 614770 350220 2012-05-01 2A
#> 27 1442126 587451 350221 2012-05-01 2A
#> 28 1421945 593367 350222 2012-03-02 2A
#> 29 1432495 600881 350223 2012-01-01 2A
#> 30 1425217 569925 350224 2012-01-01 2A
#> 31 1443625 587786 350225 2012-01-01 2A
#> 32 1420291 576096 350226 2011-11-23 2B
#> 33 1411711 590667 350227 2011-11-13 2A
#> 34 1409219 598488 350228 2011-10-01 2A
#> 35 1430895 598069 350229 2011-07-01 2A
#> 36 1428112 594159 350230 2011-06-26 4E
#> 37 1415406 593850 350231 2011-06-01 2A
#> 38 1437197 588119 350232 2011-06-01 2A
#> 39 1411761 577555 350233 2011-06-01 2B
#> 40 1396849 608534 350234 2011-05-01 2A
#> 41 1440774 620153 350235 2011-04-01 2A
#> 42 1419410 597400 350236 2011-01-12 2A
#> 43 1404399 601130 350237 2011-01-01 2A
#> 44 1414412 594648 350238 2011-01-01 2A
#> 45 1414383 594720 350239 2010-04-10 2A
#> 46 1430925 597814 350240 2010-01-01 2A
#> 47 1428754 589755 350241 2010-01-01 2A
#> 48 1421791 596463 350242 2009-11-09 2A
#> 49 1413014 597520 350243 2009-06-06 2A
#> 50 1416144 614433 350244 2009-06-01 2A
#> 51 1434793 605627 350245 2009-04-12 2A
#> 52 1437263 603817 350246 2009-02-01 2A
#> 53 1429534 598828 350247 2009-01-01 2A
#> 54 1435486 620491 350248 2009-01-01 2A
#> 55 1423164 599637 350249 2008-09-26 2A
#> 56 1415165 597492 350250 2008-09-17 2A
#> 57 1411572 589938 350251 2008-06-24 2A
#> 58 1426361 612472 350252 2008-02-01 2A
#> 59 1442345 616992 350253 2008-01-01 2A
#> 60 1428796 611901 350254 2008-01-01 2A
#> 61 1423348 596214 350255 2007-09-23 2A
#> 62 1427303 597798 350256 2007-09-04 2A
#> 63 1413927 588563 350257 2007-08-26 2A
#> 64 1444493 589648 350258 2007-04-21 4C
#> 65 1417432 575792 350259 2007-01-14 2B
#> 66 1427160 617211 350260 2007-01-01 2A
#> 67 1414219 593336 350261 2006-02-27 2A
#> 68 1414276 593154 350262 2004-04-01 2A
#> 69 1433218 609590 350263 2004-01-01 2A
#> 70 1413150 577196 350264 2003-01-01 2A
#> 71 1405466 611187 350265 2001-10-01 2A
#> 72 1420386 587169 350266 2001-05-01 2A
#> 73 1424562 570905 350267 2001-01-01 2A
#> 74 1438750 617775 350268 2000-05-23 2A
#> 75 1420754 600647 350269 2000-01-08 2A
#> 76 1428574 604688 350270 2000-01-01 2A
#> Location Description Inside_Outside Weapon
#> 1 4300 ADELLE TER RAPE I OTHER
#> 2 900 SPANGLER WAY RAPE I OTHER
#> 3 1600 N WOLFE ST RAPE I OTHER
#> 4 500 S BOND ST RAPE I OTHER
#> 5 2500 W LOMBARD ST RAPE I OTHER
#> 6 3400 NORTHWAY DR RAPE I OTHER
#> 7 1600 BRUCE CT RAPE I OTHER
#> 8 4000 PENNINGTON AVE RAPE I OTHER
#> 9 3900 CLARINTH RD RAPE I OTHER
#> 10 3500 8TH AVE RAPE OTHER
#> 11 2400 W LEXINGTON ST RAPE I OTHER
#> 12 7300 PARK HEIGHTS AVE RAPE I OTHER
#> 13 1000 W 38TH ST RAPE I OTHER
#> 14 4400 ASBURY AVE RAPE OTHER
#> 15 3000 W GARRISON AVE RAPE I OTHER
#> 16 3800 BROOKLYN AVE RAPE I OTHER
#> 17 400 GUSRYAN ST RAPE OTHER
#> 18 3200 N CHARLES ST RAPE I OTHER
#> 19 4300 PARKTON ST RAPE I OTHER
#> 20 1100 E BALTIMORE ST RAPE I OTHER
#> 21 6200 TRAMORE RD RAPE I OTHER
#> 22 3600 5TH ST RAPE I OTHER
#> 23 4800 PALMER AVE LARCENY Outside <NA>
#> 24 3700 MOUNT PLEASANT AVE RAPE OTHER
#> 25 2500 W FRANKLIN ST RAPE O OTHER
#> 26 3800 MENLO DR RAPE I OTHER
#> 27 6200 TOONE ST RAPE I OTHER
#> 28 500 N CALVERT ST RAPE I OTHER
#> 29 3400 ELMORA AVE RAPE I OTHER
#> 30 4100 8TH ST RAPE I OTHER
#> 31 1200 STEELTON AVE RAPE I OTHER
#> 32 3200 GULFPORT DR RAPE I OTHER
#> 33 2000 W BALTIMORE ST RAPE I OTHER
#> 34 2500 W NORTH AVE RAPE OTHER
#> 35 2600 E OLIVER ST RAPE I OTHER
#> 36 600 N WOLFE ST COMMON ASSAULT Outside <NA>
#> 37 1100 HARLEM AVE RAPE I OTHER
#> 38 4400 ODONNELL ST RAPE I OTHER
#> 39 3100 SAVOY ST RAPE I OTHER
#> 40 3800 N ROGERS AVE RAPE I OTHER
#> 41 3000 HARVIEW AVE RAPE I OTHER
#> 42 1200 JOHN ST RAPE I OTHER
#> 43 900 DENISON RAPE I OTHER
#> 44 1300 W LAFAYETTE AVE RAPE OTHER
#> 45 1300 W LAFAYETTE AVE RAPE I OTHER
#> 46 2600 LLEWELYN AVE RAPE I OTHER
#> 47 500 S WASHINGTON ST RAPE I OTHER
#> 48 1200 N CALVERT ST RAPE I OTHER
#> 49 1600 VINCENT CT RAPE I OTHER
#> 50 700 WYNDHURST AVE RAPE OTHER
#> 51 4100 PARKSIDE DR RAPE I OTHER
#> 52 4600 SHAMROCK AVE RAPE I OTHER
#> 53 1700 N PATTERSON PARK AVE RAPE I OTHER
#> 54 2400 PICKERING DR RAPE OTHER
#> 55 700 E 20TH ST RAPE I OTHER
#> 56 500 PRESSTMAN ST RAPE I OTHER
#> 57 2000 FREDERICK AVE RAPE I OTHER
#> 58 4600 MARBLE HALL RD RAPE I OTHER
#> 59 3500 NORTHWAY DR RAPE I OTHER
#> 60 1600 COLD SPRING LN RAPE I OTHER
#> 61 1100 GREENMOUNT AVE RAPE I OTHER
#> 62 1700 E OLIVER ST RAPE I OTHER
#> 63 1500 COLE ST RAPE I OTHER
#> 64 6800 FAIT AVE AGG. ASSAULT OTHER
#> 65 2700 CLAFLIN CT RAPE I OTHER
#> 66 1200 E BELVEDERE AVE RAPE I OTHER
#> 67 1400 EDMONDSON AVE RAPE OTHER
#> 68 500 N CALHOUN ST RAPE OTHER
#> 69 4400 HARFORD RD RAPE I OTHER
#> 70 2100 W PATAPSCO AVE RAPE OTHER
#> 71 4700 PARK HEIGHTS AVE RAPE I OTHER
#> 72 200 W HAMBURG ST RAPE O OTHER
#> 73 3800 6TH ST RAPE I OTHER
#> 74 6500 HARFORD RD RAPE OTHER
#> 75 2300 N CHARLES ST RAPE OTHER
#> 76 1900 E 31ST ST RAPE I OTHER
#> Post District Neighborhood Latitude Longitude
#> 1 824 SOUTHWEST IRVINGTON 39.3 -76.7
#> 2 433 NORTHEAST ARMISTEAD GARDENS 39.3 -76.5
#> 3 331 EASTERN BROADWAY EAST 39.3 -76.6
#> 4 213 SOUTHEAST FELLS POINT 39.3 -76.6
#> 5 835 SOUTHWEST SHIPLEY HILL 39.3 -76.7
#> 6 424 NORTHEAST NORTH HARFORD ROAD 39.4 -76.5
#> 7 734 WESTERN SANDTOWN-WINCHESTER 39.3 -76.6
#> 8 911 SOUTHERN CURTIS BAY 39.2 -76.6
#> 9 631 NORTHWEST FALLSTAFF 39.4 -76.7
#> 10 912 SOUTHERN FAIRFIELD AREA 39.2 -76.6
#> 11 714 WESTERN PENROSE/FAYETTE STREET OUTREACH 39.3 -76.7
#> 12 631 NORTHWEST CROSS COUNTRY 39.4 -76.7
#> 13 531 NORTHERN HAMPDEN 39.3 -76.6
#> 14 442 NORTHEAST BELAIR-EDISON 39.3 -76.6
#> 15 614 NORTHWEST CENTRAL PARK HEIGHTS 39.3 -76.7
#> 16 913 SOUTHERN BROOKLYN 39.2 -76.6
#> 17 232 SOUTHEAST BAYVIEW 39.3 -76.5
#> 18 511 NORTHERN JOHNS HOPKINS HOMEWOOD 39.3 -76.6
#> 19 833 SOUTHWEST YALE HEIGHTS 39.3 -76.7
#> 20 211 SOUTHEAST JONESTOWN 39.3 -76.6
#> 21 423 NORTHEAST HAMILTON HILLS 39.4 -76.6
#> 22 913 SOUTHERN BROOKLYN 39.2 -76.6
#> 23 614 NORTHWEST CENTRAL PARK HEIGHTS 39.3 -76.7
#> 24 223 SOUTHEAST BALTIMORE HIGHLANDS 39.3 -76.6
#> 25 721 WESTERN ROSEMONT HOMEOWNERS/TENANTS 39.3 -76.7
#> 26 632 NORTHWEST GLEN 39.4 -76.7
#> 27 233 SOUTHEAST O'DONNELL HEIGHTS 39.3 -76.5
#> 28 124 CENTRAL MOUNT VERNON 39.3 -76.6
#> 29 434 NORTHEAST FOUR BY FOUR 39.3 -76.6
#> 30 913 SOUTHERN BROOKLYN 39.2 -76.6
#> 31 234 SOUTHEAST GRACELAND PARK 39.3 -76.5
#> 32 922 SOUTHERN CHERRY HILL 39.2 -76.6
#> 33 714 WESTERN BOYD-BOOTH 39.3 -76.6
#> 34 731 WESTERN MONDAWMIN 39.3 -76.7
#> 35 332 EASTERN BEREA 39.3 -76.6
#> 36 321 EASTERN DUNBAR-BROADWAY 39.3 -76.6
#> 37 713 WESTERN HARLEM PARK 39.3 -76.6
#> 38 233 SOUTHEAST CANTON INDUSTRIAL AREA 39.3 -76.6
#> 39 923 SOUTHERN LAKELAND 39.3 -76.6
#> 40 634 NORTHWEST GROVE PARK 39.3 -76.7
#> 41 424 NORTHEAST NORTH HARFORD ROAD 39.4 -76.5
#> 42 132 CENTRAL BOLTON HILL 39.3 -76.6
#> 43 624 NORTHWEST HANLON-LONGWOOD 39.3 -76.7
#> 44 724 WESTERN HARLEM PARK 39.3 -76.6
#> 45 724 WESTERN SANDTOWN-WINCHESTER 39.3 -76.6
#> 46 332 EASTERN BEREA 39.3 -76.6
#> 47 213 SOUTHEAST FELLS POINT 39.3 -76.6
#> 48 134 CENTRAL MID-TOWN BELVEDERE 39.3 -76.6
#> 49 734 WESTERN SANDTOWN-WINCHESTER 39.3 -76.6
#> 50 521 NORTHERN WYNDHURST 39.4 -76.6
#> 51 422 NORTHEAST BELAIR-PARKSIDE 39.3 -76.6
#> 52 442 NORTHEAST PARKSIDE 39.3 -76.6
#> 53 331 EASTERN BROADWAY EAST 39.3 -76.6
#> 54 423 NORTHEAST HAMILTON HILLS 39.4 -76.6
#> 55 312 EASTERN EAST BALTIMORE MIDWAY 39.3 -76.6
#> 56 123 CENTRAL UPTON 39.3 -76.6
#> 57 835 SOUTHWEST BOYD-BOOTH 39.3 -76.6
#> 58 413 NORTHEAST NEW NORTHWOOD 39.3 -76.6
#> 59 424 NORTHEAST NORTH HARFORD ROAD 39.4 -76.5
#> 60 413 NORTHEAST STONEWOOD-PENTWOOD-WINSTON 39.3 -76.6
#> 61 313 EASTERN JOHNSTON SQUARE 39.3 -76.6
#> 62 331 EASTERN BROADWAY EAST 39.3 -76.6
#> 63 935 SOUTHERN NEW SOUTHWEST/MOUNT CLARE 39.3 -76.6
#> 64 234 SOUTHEAST GRACELAND PARK 39.3 -76.5
#> 65 922 SOUTHERN CHERRY HILL 39.2 -76.6
#> 66 414 NORTHEAST WOODBOURNE HEIGHTS 39.4 -76.6
#> 67 713 WESTERN HARLEM PARK 39.3 -76.6
#> 68 713 WESTERN HARLEM PARK 39.3 -76.6
#> 69 421 NORTHEAST BEVERLY HILLS 39.3 -76.6
#> 70 923 SOUTHERN LAKELAND 39.3 -76.6
#> 71 614 NORTHWEST CENTRAL PARK HEIGHTS 39.3 -76.7
#> 72 941 SOUTHERN SHARP-LEADENHALL 39.3 -76.6
#> 73 913 SOUTHERN BROOKLYN 39.2 -76.6
#> 74 424 NORTHEAST WESTFIELD 39.4 -76.6
#> 75 514 NORTHERN OLD GOUCHER 39.3 -76.6
#> 76 411 NORTHEAST COLDSTREAM HOMESTEAD MONTEBELLO 39.3 -76.6
#> GeoLocation Premise VRIName
#> 1 (39.2825,-76.6876) ROW/TOWNHOUSE-OCC
#> 2 (39.3027,-76.5494) ROW/TOWNHOUSE-OCC
#> 3 (39.309,-76.5915) ROW/TOWNHOUSE-OCC Eastern 1
#> 4 (39.2854,-76.5953) ROW/TOWNHOUSE-OCC
#> 5 (39.2858,-76.6561) COURT HOUSE
#> 6 (39.3606,-76.5421) ROW/TOWNHOUSE-OCC
#> 7 (39.3068,-76.6454) APT/CONDO - OCCUPIED Western
#> 8 (39.2289,-76.5889) ROW/TOWNHOUSE-OCC
#> 9 (39.3594,-76.7052) ROW/TOWNHOUSE-OCC
#> 10 (39.2321,-76.5743)
#> 11 (39.2902,-76.6547) ROW/TOWNHOUSE-OCC
#> 12 (39.37,-76.7089) ROW/TOWNHOUSE-OCC
#> 13 (39.3341,-76.6337) ROW/TOWNHOUSE-OCC
#> 14 (39.3288,-76.5628)
#> 15 (39.3485,-76.6717) ROW/TOWNHOUSE-OCC
#> 16 (39.2337,-76.6013) ROW/TOWNHOUSE-OCC Brooklyn
#> 17 (39.2886,-76.541)
#> 18 (39.3273,-76.6178) OTHER - INSIDE
#> 19 (39.2776,-76.688) ROW/TOWNHOUSE-OCC
#> 20 (39.2912,-76.6) ROW/TOWNHOUSE-OCC
#> 21 (39.3605,-76.5669) APT/CONDO - OCCUPIED
#> 22 (39.2364,-76.604) ROW/TOWNHOUSE-OCC Brooklyn
#> 23 (39.3464,-76.6707) OTHER/RESIDENTIAL
#> 24 (39.2909,-76.5664)
#> 25 (39.2935,-76.6563) STREET
#> 26 (39.3543,-76.6937) ROW/TOWNHOUSE-OCC
#> 27 (39.2788,-76.5415) ROW/TOWNHOUSE-OCC
#> 28 (39.2953,-76.6127) ROW/TOWNHOUSE-OCC
#> 29 (39.3158,-76.5753) ROW/TOWNHOUSE-OCC
#> 30 (39.2309,-76.6015) ROW/TOWNHOUSE-OCC
#> 31 (39.2797,-76.5362) ROW/TOWNHOUSE-OCC
#> 32 (39.2479,-76.6188) ROW/TOWNHOUSE-OCC
#> 33 (39.288,-76.6489) ROW/TOWNHOUSE-OCC Tri-District
#> 34 (39.3095,-76.6576)
#> 35 (39.3081,-76.581) ROW/TOWNHOUSE-OCC
#> 36 (39.2974,-76.5909) DRUG STORE / MED BL
#> 37 (39.2967,-76.6358) ROW/TOWNHOUSE-OCC Central
#> 38 (39.2807,-76.5589) OTHER - INSIDE
#> 39 (39.252,-76.6489) OTHER - INSIDE
#> 40 (39.3372,-76.7012) ROW/TOWNHOUSE-OCC
#> 41 (39.3686,-76.5457) ROW/TOWNHOUSE-OCC
#> 42 (39.3064,-76.6216) ROW/TOWNHOUSE-OCC
#> 43 (39.3168,-76.6746) ROW/TOWNHOUSE-OCC
#> 44 (39.2989,-76.6393) Central
#> 45 (39.2991,-76.6394) ROW/TOWNHOUSE-OCC Central
#> 46 (39.3074,-76.5809) ROW/TOWNHOUSE-OCC
#> 47 (39.2853,-76.5887) ROW/TOWNHOUSE-OCC
#> 48 (39.3038,-76.6132) ROW/TOWNHOUSE-OCC
#> 49 (39.3068,-76.6442) ROW/TOWNHOUSE-OCC Western
#> 50 (39.3532,-76.6329)
#> 51 (39.3288,-76.5671) ROW/TOWNHOUSE-OCC
#> 52 (39.3238,-76.5584) ROW/TOWNHOUSE-OCC
#> 53 (39.3102,-76.5858) ROW/TOWNHOUSE-OCC
#> 54 (39.3696,-76.5644)
#> 55 (39.3125,-76.6083) ROW/TOWNHOUSE-OCC
#> 56 (39.3067,-76.6366) ROW/TOWNHOUSE-OCC
#> 57 (39.286,-76.6494) ROW/TOWNHOUSE-OCC Tri-District
#> 58 (39.3477,-76.5968) ROW/TOWNHOUSE-OCC
#> 59 (39.3599,-76.5402) ROW/TOWNHOUSE-OCC
#> 60 (39.3461,-76.5882) ROW/TOWNHOUSE-OCC
#> 61 (39.3031,-76.6077) OTHER - INSIDE
#> 62 (39.3074,-76.5937) ROW/TOWNHOUSE-OCC Eastern 1
#> 63 (39.2822,-76.6411) ROW/TOWNHOUSE-VAC
#> 64 (39.2848,-76.5331)
#> 65 (39.2471,-76.6289) ROW/TOWNHOUSE-OCC
#> 66 (39.3607,-76.5939) ROW/TOWNHOUSE-OCC
#> 67 (39.2953,-76.64) Central
#> 68 (39.2948,-76.6398) Central
#> 69 (39.3397,-76.5726) ROW/TOWNHOUSE-OCC
#> 70 (39.251,-76.644)
#> 71 (39.3444,-76.6707) ROW/TOWNHOUSE-OCC
#> 72 (39.2783,-76.6183) BUS/AUTO
#> 73 (39.2336,-76.6038) ROW/TOWNHOUSE-OCC Brooklyn
#> 74 (39.3621,-76.5529)
#> 75 (39.3153,-76.6168)
#> 76 (39.3263,-76.5891) ROW/TOWNHOUSE-OCC
#> Total_Incidents Shape neigh FID
#> 1 1 NA irvington 1
#> 2 1 NA armistead gardens 9
#> 3 1 NA broadway east 10
#> 4 1 NA fells point 16
#> 5 1 NA shipley hill 47
#> 6 1 NA north harford road 25
#> 7 1 NA sandtown-winchester 43
#> 8 1 NA curtis bay 4
#> 9 1 NA fallstaff 18
#> 10 1 NA fairfield area 4
#> 11 1 NA penrose/fayette street outreach 47
#> 12 1 NA cross country 11
#> 13 1 NA hampden 32
#> 14 1 NA belair-edison 3
#> 15 1 NA central park heights 41
#> 16 1 NA brooklyn 4
#> 17 1 NA bayview 38
#> 18 1 NA johns hopkins homewood 19
#> 19 1 NA yale heights 1
#> 20 1 NA jonestown 53
#> 21 1 NA hamilton hills 25
#> 22 1 NA brooklyn 4
#> 23 1 NA central park heights 41
#> 24 1 NA baltimore highlands 38
#> 25 1 NA rosemont 23
#> 26 1 NA glen 18
#> 27 1 NA o'donnell heights 45
#> 28 1 NA mount vernon 55
#> 29 1 NA four by four 3
#> 30 1 NA brooklyn 4
#> 31 1 NA graceland park 45
#> 32 1 NA cherry hill 7
#> 33 1 NA booth-boyd 47
#> 34 1 NA mondawmin 21
#> 35 1 NA berea 10
#> 36 1 NA dunbar-broadway 52
#> 37 1 NA harlem park 43
#> 38 1 NA canton industrial area 45
#> 39 1 NA lakeland 50
#> 40 1 NA grove park 27
#> 41 1 NA north harford road 25
#> 42 1 NA bolton hill 55
#> 43 1 NA hanlon-longwood 21
#> 44 1 NA harlem park 43
#> 45 1 NA sandtown-winchester 43
#> 46 1 NA berea 10
#> 47 1 NA fells point 16
#> 48 1 NA mid-town belvedere 55
#> 49 1 NA sandtown-winchester 43
#> 50 1 NA wyndhurst 22
#> 51 1 NA belair-parkside 29
#> 52 1 NA parkside 6
#> 53 1 NA broadway east 10
#> 54 1 NA hamilton hills 25
#> 55 1 NA east baltimore midway 33
#> 56 1 NA upton 54
#> 57 1 NA booth-boyd 47
#> 58 1 NA new northwood 37
#> 59 1 NA north harford road 25
#> 60 1 NA stonewood-pentwood-winston 37
#> 61 1 NA johnston square 56
#> 62 1 NA broadway east 10
#> 63 1 NA hollins market 42
#> 64 1 NA graceland park 45
#> 65 1 NA cherry hill 7
#> 66 1 NA woodbourne heights 30
#> 67 1 NA harlem park 43
#> 68 1 NA harlem park 43
#> 69 1 NA beverly hills 29
#> 70 1 NA lakeland 50
#> 71 1 NA central park heights 41
#> 72 1 NA sharp-leadenhall 28
#> 73 1 NA brooklyn 4
#> 74 1 NA westfield 24
#> 75 1 NA old goucher 19
#> 76 1 NA coldstream homestead montebello 33
#> Community
#> 1 Allendale/Irvington/S. Hilton
#> 2 Claremont/Armistead
#> 3 Clifton-Berea
#> 4 Fells Point
#> 5 Southwest Baltimore
#> 6 Harford/Echodale
#> 7 Sandtown-Winchester/Harlem Park
#> 8 Brooklyn/Curtis Bay/Hawkins Point
#> 9 Glen-Fallstaff
#> 10 Brooklyn/Curtis Bay/Hawkins Point
#> 11 Southwest Baltimore
#> 12 Cross-Country/Cheswolde
#> 13 Medfield/Hampden/Woodberry/Remington
#> 14 Belair-Edison
#> 15 Pimlico/Arlington/Hilltop
#> 16 Brooklyn/Curtis Bay/Hawkins Point
#> 17 Orangeville/East Highlandtown
#> 18 Greater Charles Village/Barclay
#> 19 Allendale/Irvington/S. Hilton
#> 20 Harbor East/Little Italy
#> 21 Harford/Echodale
#> 22 Brooklyn/Curtis Bay/Hawkins Point
#> 23 Pimlico/Arlington/Hilltop
#> 24 Orangeville/East Highlandtown
#> 25 Greater Rosemont
#> 26 Glen-Fallstaff
#> 27 Southeastern
#> 28 Midtown
#> 29 Belair-Edison
#> 30 Brooklyn/Curtis Bay/Hawkins Point
#> 31 Southeastern
#> 32 Cherry Hill
#> 33 Southwest Baltimore
#> 34 Greater Mondawmin
#> 35 Clifton-Berea
#> 36 Oldtown/Middle East
#> 37 Sandtown-Winchester/Harlem Park
#> 38 Southeastern
#> 39 Westport/Mount Winans/Lakeland
#> 40 Howard Park/West Arlington
#> 41 Harford/Echodale
#> 42 Midtown
#> 43 Greater Mondawmin
#> 44 Sandtown-Winchester/Harlem Park
#> 45 Sandtown-Winchester/Harlem Park
#> 46 Clifton-Berea
#> 47 Fells Point
#> 48 Midtown
#> 49 Sandtown-Winchester/Harlem Park
#> 50 Greater Roland Park/Poplar Hill
#> 51 Lauraville
#> 52 Cedonia/Frankford
#> 53 Clifton-Berea
#> 54 Harford/Echodale
#> 55 Midway/Coldstream
#> 56 Upton/Druid Heights
#> 57 Southwest Baltimore
#> 58 Northwood
#> 59 Harford/Echodale
#> 60 Northwood
#> 61 Greenmount East
#> 62 Clifton-Berea
#> 63 Poppleton/The Terraces/Hollins Market
#> 64 Southeastern
#> 65 Cherry Hill
#> 66 Loch Raven
#> 67 Sandtown-Winchester/Harlem Park
#> 68 Sandtown-Winchester/Harlem Park
#> 69 Lauraville
#> 70 Westport/Mount Winans/Lakeland
#> 71 Pimlico/Arlington/Hilltop
#> 72 Inner Harbor/Federal Hill
#> 73 Brooklyn/Curtis Bay/Hawkins Point
#> 74 Hamilton
#> 75 Greater Charles Village/Barclay
#> 76 Midway/Coldstream
#> Neigh
#> 1 Irvington
#> 2 Armistead Gardens
#> 3 Broadway East
#> 4 Fells Point
#> 5 Shipley Hill
#> 6 North Harford Road
#> 7 Sandtown-Winchester
#> 8 Curtis Bay
#> 9 Fallstaff
#> 10 Fairfield Area
#> 11 Penrose/Fayette Street Outreach
#> 12 Cross Country
#> 13 Hampden
#> 14 Belair-Edison
#> 15 Central Park Heights
#> 16 Brooklyn
#> 17 Bayview
#> 18 Johns Hopkins Homewood
#> 19 Yale Heights
#> 20 Jonestown
#> 21 Hamilton Hills
#> 22 Brooklyn
#> 23 Central Park Heights
#> 24 Baltimore Highlands
#> 25 Rosemont
#> 26 Glen
#> 27 O'Donnell Heights
#> 28 Mount Vernon
#> 29 Four By Four
#> 30 Brooklyn
#> 31 Graceland Park
#> 32 Cherry Hill
#> 33 Booth-Boyd
#> 34 Mondawmin
#> 35 Berea
#> 36 Dunbar-Broadway
#> 37 Harlem Park
#> 38 Canton Industrial Area
#> 39 Lakeland
#> 40 Grove Park
#> 41 North Harford Road
#> 42 Bolton Hill
#> 43 Hanlon-Longwood
#> 44 Harlem Park
#> 45 Sandtown-Winchester
#> 46 Berea
#> 47 Fells Point
#> 48 Mid-Town Belvedere
#> 49 Sandtown-Winchester
#> 50 Wyndhurst
#> 51 Belair-Parkside
#> 52 Parkside
#> 53 Broadway East
#> 54 Hamilton Hills
#> 55 East Baltimore Midway
#> 56 Upton
#> 57 Booth-Boyd
#> 58 New Northwood
#> 59 North Harford Road
#> 60 Stonewood-Pentwood-Winston
#> 61 Johnston Square
#> 62 Broadway East
#> 63 Hollins Market
#> 64 Graceland Park
#> 65 Cherry Hill
#> 66 Woodbourne Heights
#> 67 Harlem Park
#> 68 Harlem Park
#> 69 Beverly Hills
#> 70 Lakeland
#> 71 Central Park Heights
#> 72 Sharp-Leadenhall
#> 73 Brooklyn
#> 74 Westfield
#> 75 Old Goucher
#> 76 Coldstream Homestead Montebello
#> Tracts
#> 1 280404, 200701, 200600, 200702, 200800, 250102
#> 2 260303, 260401, 260403, 260402
#> 3 080500, 080302, 080200, 080301, 080400
#> 4 020200, 020300, 020100, 010500
#> 5 200400, 200500, 190100, 200200, 190200, 200300, 200100, 190300
#> 6 270501, 270600, 270701, 270702, 270703
#> 7 150100, 160400, 160100, 160200, 150200, 160300
#> 8 250500, 250600, 250401, 250402
#> 9 280101, 272007, 271900, 272006
#> 10 250500, 250600, 250401, 250402
#> 11 200400, 200500, 190100, 200200, 190200, 200300, 200100, 190300
#> 12 272003, 272005, 272004
#> 13 130803, 130700, 130600, 120700, 130804, 130806
#> 14 260301, 080102, 080101, 260302
#> 15 271802, 271801, 271700
#> 16 250500, 250600, 250401, 250402
#> 17 260404, 260501, 260700
#> 18 120400, 120300, 120600, 120202, 120201
#> 19 280404, 200701, 200600, 200702, 200800, 250102
#> 20 030100, 030200
#> 21 270501, 270600, 270701, 270702, 270703
#> 22 250500, 250600, 250401, 250402
#> 23 271802, 271801, 271700
#> 24 260404, 260501, 260700
#> 25 150300, 160700, 160600, 150600, 160500
#> 26 280101, 272007, 271900, 272006
#> 27 260605, 260604
#> 28 110100, 110200, 140100, 120500
#> 29 260301, 080102, 080101, 260302
#> 30 250500, 250600, 250401, 250402
#> 31 260605, 260604
#> 32 250207, 250204, 250203
#> 33 200400, 200500, 190100, 200200, 190200, 200300, 200100, 190300
#> 34 150500, 150400, 150702, 150701
#> 35 080500, 080302, 080200, 080301, 080400
#> 36 100200, 060400, 070400, 280500, 080800
#> 37 150100, 160400, 160100, 160200, 150200, 160300
#> 38 260605, 260604
#> 39 250301, 250205
#> 40 280200, 280102
#> 41 270501, 270600, 270701, 270702, 270703
#> 42 110100, 110200, 140100, 120500
#> 43 150500, 150400, 150702, 150701
#> 44 150100, 160400, 160100, 160200, 150200, 160300
#> 45 150100, 160400, 160100, 160200, 150200, 160300
#> 46 080500, 080302, 080200, 080301, 080400
#> 47 020200, 020300, 020100, 010500
#> 48 110100, 110200, 140100, 120500
#> 49 150100, 160400, 160100, 160200, 150200, 160300
#> 50 271300, 271400, 271503
#> 51 270302, 270301, 270101, 270200, 270102
#> 52 260203, 260101, 260102, 260201, 260202
#> 53 080500, 080302, 080200, 080301, 080400
#> 54 270501, 270600, 270701, 270702, 270703
#> 55 090800, 090600, 090700
#> 56 170200, 140200, 140300, 170300
#> 57 200400, 200500, 190100, 200200, 190200, 200300, 200100, 190300
#> 58 090200, 270903, 270902, 270901
#> 59 270501, 270600, 270701, 270702, 270703
#> 60 090200, 270903, 270902, 270901
#> 61 080700, 090900, 100100, 080600
#> 62 080500, 080302, 080200, 080301, 080400
#> 63 180200, 180300, 180100
#> 64 260605, 260604
#> 65 250207, 250204, 250203
#> 66 270801, 270803, 270802
#> 67 150100, 160400, 160100, 160200, 150200, 160300
#> 68 150100, 160400, 160100, 160200, 150200, 160300
#> 69 270302, 270301, 270101, 270200, 270102
#> 70 250301, 250205
#> 71 271802, 271801, 271700
#> 72 220100, 240200, 240300, 230100, 230200
#> 73 250500, 250600, 250401, 250402
#> 74 270401, 270402, 270502
#> 75 120400, 120300, 120600, 120202, 120201
#> 76 090800, 090600, 090700
#> Link
#> 1 http://bniajfi.org/community/Allendale_Irvington_S.%20Hilton/
#> 2 http://bniajfi.org/community/Claremont_Armistead/
#> 3 http://bniajfi.org/community/Clifton-Berea/
#> 4 http://bniajfi.org/community/Fells%20Point/
#> 5 http://bniajfi.org/community/Southwest%20Baltimore/
#> 6 http://bniajfi.org/community/Harford_Echodale/
#> 7 http://bniajfi.org/community/Sandtown-Winchester_Harlem%20Park/
#> 8 http://bniajfi.org/community/Brooklyn_Curtis%20Bay_Hawkins%20Point/
#> 9 http://bniajfi.org/community/Glen-Fallstaff/
#> 10 http://bniajfi.org/community/Brooklyn_Curtis%20Bay_Hawkins%20Point/
#> 11 http://bniajfi.org/community/Southwest%20Baltimore/
#> 12 http://bniajfi.org/community/Cross-Country_Cheswolde/
#> 13 http://bniajfi.org/community/Medfield_Hampden_Woodberry_Remington/
#> 14 http://bniajfi.org/community/Belair-Edison/
#> 15 http://bniajfi.org/community/Pimlico_Arlington_Hilltop/
#> 16 http://bniajfi.org/community/Brooklyn_Curtis%20Bay_Hawkins%20Point/
#> 17 http://bniajfi.org/community/Orangeville_East%20Highlandtown
#> 18 http://bniajfi.org/community/Greater%20Charles%20Village_Barclay/
#> 19 http://bniajfi.org/community/Allendale_Irvington_S.%20Hilton/
#> 20 http://bniajfi.org/community/Harbor%20East_Little%20Italy/
#> 21 http://bniajfi.org/community/Harford_Echodale/
#> 22 http://bniajfi.org/community/Brooklyn_Curtis%20Bay_Hawkins%20Point/
#> 23 http://bniajfi.org/community/Pimlico_Arlington_Hilltop/
#> 24 http://bniajfi.org/community/Orangeville_East%20Highlandtown
#> 25 http://bniajfi.org/community/Greater%20Rosemont/
#> 26 http://bniajfi.org/community/Glen-Fallstaff/
#> 27 http://bniajfi.org/community/Southeastern
#> 28 http://bniajfi.org/community/Midtown/
#> 29 http://bniajfi.org/community/Belair-Edison/
#> 30 http://bniajfi.org/community/Brooklyn_Curtis%20Bay_Hawkins%20Point/
#> 31 http://bniajfi.org/community/Southeastern
#> 32 http://bniajfi.org/community/Cherry%20Hill
#> 33 http://bniajfi.org/community/Southwest%20Baltimore/
#> 34 http://bniajfi.org/community/Greater%20Mondawmin/
#> 35 http://bniajfi.org/community/Clifton-Berea/
#> 36 http://bniajfi.org/community/Oldtown_Middle%20East/
#> 37 http://bniajfi.org/community/Sandtown-Winchester_Harlem%20Park/
#> 38 http://bniajfi.org/community/Southeastern
#> 39 http://bniajfi.org/community/Westport_Mount%20Winans_Lakeland/
#> 40 http://bniajfi.org/community/Howard%20Park_West%20Arlington/
#> 41 http://bniajfi.org/community/Harford_Echodale/
#> 42 http://bniajfi.org/community/Midtown/
#> 43 http://bniajfi.org/community/Greater%20Mondawmin/
#> 44 http://bniajfi.org/community/Sandtown-Winchester_Harlem%20Park/
#> 45 http://bniajfi.org/community/Sandtown-Winchester_Harlem%20Park/
#> 46 http://bniajfi.org/community/Clifton-Berea/
#> 47 http://bniajfi.org/community/Fells%20Point/
#> 48 http://bniajfi.org/community/Midtown/
#> 49 http://bniajfi.org/community/Sandtown-Winchester_Harlem%20Park/
#> 50 http://bniajfi.org/community/Greater%20Roland%20Park_Poplar%20Hill/
#> 51 http://bniajfi.org/community/Lauraville/
#> 52 http://bniajfi.org/community/Cedonia_Frankford/
#> 53 http://bniajfi.org/community/Clifton-Berea/
#> 54 http://bniajfi.org/community/Harford_Echodale/
#> 55 http://bniajfi.org/community/Midway_Coldstream/
#> 56 http://bniajfi.org/community/Upton_Druid%20Heights/
#> 57 http://bniajfi.org/community/Southwest%20Baltimore/
#> 58 http://bniajfi.org/community/Northwood/
#> 59 http://bniajfi.org/community/Harford_Echodale/
#> 60 http://bniajfi.org/community/Northwood/
#> 61 http://bniajfi.org/community/Greenmount%20East/
#> 62 http://bniajfi.org/community/Clifton-Berea/
#> 63 http://bniajfi.org/community/Poppleton_The%20Terraces_Hollins%20Market/
#> 64 http://bniajfi.org/community/Southeastern
#> 65 http://bniajfi.org/community/Cherry%20Hill
#> 66 http://bniajfi.org/community/Loch%20Raven/
#> 67 http://bniajfi.org/community/Sandtown-Winchester_Harlem%20Park/
#> 68 http://bniajfi.org/community/Sandtown-Winchester_Harlem%20Park/
#> 69 http://bniajfi.org/community/Lauraville/
#> 70 http://bniajfi.org/community/Westport_Mount%20Winans_Lakeland/
#> 71 http://bniajfi.org/community/Pimlico_Arlington_Hilltop/
#> 72 http://bniajfi.org/community/Inner%20Harbor_Federal%20Hill/
#> 73 http://bniajfi.org/community/Brooklyn_Curtis%20Bay_Hawkins%20Point/
#> 74 http://bniajfi.org/community/Hamilton/
#> 75 http://bniajfi.org/community/Greater%20Charles%20Village_Barclay/
#> 76 http://bniajfi.org/community/Midway_Coldstream/
#> Category
#> 1 Felony
#> 2 Felony
#> 3 Felony
#> 4 Felony
#> 5 Felony
#> 6 Felony
#> 7 Felony
#> 8 Felony
#> 9 Felony
#> 10 Felony
#> 11 Felony
#> 12 Felony
#> 13 Felony
#> 14 Felony
#> 15 Felony
#> 16 Felony
#> 17 Felony
#> 18 Felony
#> 19 Felony
#> 20 Felony
#> 21 Felony
#> 22 Felony
#> 23 Misdemeanor
#> 24 Felony
#> 25 Felony
#> 26 Felony
#> 27 Felony
#> 28 Felony
#> 29 Felony
#> 30 Felony
#> 31 Felony
#> 32 Felony
#> 33 Felony
#> 34 Felony
#> 35 Felony
#> 36 Misdemeanor
#> 37 Felony
#> 38 Felony
#> 39 Felony
#> 40 Felony
#> 41 Felony
#> 42 Felony
#> 43 Felony
#> 44 Felony
#> 45 Felony
#> 46 Felony
#> 47 Felony
#> 48 Felony
#> 49 Felony
#> 50 Felony
#> 51 Felony
#> 52 Felony
#> 53 Felony
#> 54 Felony
#> 55 Felony
#> 56 Felony
#> 57 Felony
#> 58 Felony
#> 59 Felony
#> 60 Felony
#> 61 Felony
#> 62 Felony
#> 63 Felony
#> 64 Felony
#> 65 Felony
#> 66 Felony
#> 67 Felony
#> 68 Felony
#> 69 Felony
#> 70 Felony
#> 71 Felony
#> 72 Felony
#> 73 Felony
#> 74 Felony
#> 75 Felony
#> 76 Felony
Next, we calculate the crime rate for each year using the pipe operator: we group the incidents by community and summarize each community's share of that year's incidents, in percent. Finally, we create the crime_evolution dataset, which joins the yearly rates of all communities into a single table.
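The rate used in this step is a share: the number of incidents in a community divided by all incidents of that year, times 100. A minimal sketch of that calculation in base R, on a toy table, may make this concrete. The rows, the `Year` column and the helper name `crime_share_by_community` are assumptions made for illustration; they are not part of the project's data or code.

```r
# Toy incident table; the rows are made up for illustration only.
incidents <- data.frame(
  Community = c("Midtown", "Midtown", "Fells Point", "Cherry Hill", "Midtown"),
  Year      = c(2021, 2021, 2021, 2021, 2020)
)

# Hypothetical helper: share of one year's incidents per community, in percent.
# It mirrors the n() / nrow(...) * 100 expression used in the report's pipeline.
crime_share_by_community <- function(df, year) {
  in_year <- df[df$Year == year, ]
  counts  <- table(in_year$Community)
  data.frame(
    Community = names(counts),
    Rate      = as.numeric(counts) / nrow(in_year) * 100
  )
}

rates_2021 <- crime_share_by_community(incidents, 2021)
print(rates_2021)
```

Because the shares of all communities add up to 100 for a given year, this measure describes how crime is distributed across communities rather than a per-capita rate.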
#_____ Calculations of the crime rates
CrimeRatePerArea2021 <- Crime_in_2021 %>%
group_by(Community) %>%
summarize(CrimeRatePerArea2021=(n()/nrow(Crime_in_2021))*100)
CrimeRatePerArea2021 <- rbind(CrimeRatePerArea2021,list("Unassigned -- Jail",0))
CrimeRatePerArea2020 <- Crime_in_2020 %>%
group_by(Community) %>%
summarize(CrimeRatePerArea2020=(n()/nrow(Crime_in_2020))*100)
CrimeRatePerArea2020 <- rbind(CrimeRatePerArea2020,list("Unassigned -- Jail",0))
CrimeRatePerArea2019 <- Crime_in_2019 %>%
group_by(Community) %>%
summarize(CrimeRatePerArea2019=(n()/nrow(Crime_in_2019))*100)
CrimeRatePerArea2019 <- rbind(CrimeRatePerArea2019,list("Unassigned -- Jail",0))
CrimeRatePerArea2018 <- Crime_in_2018 %>%
group_by(Community) %>%
summarize(CrimeRatePerArea2018=(n()/nrow(Crime_in_2018))*100)
CrimeRatePerArea2018 <- rbind(CrimeRatePerArea2018,list("Unassigned -- Jail",0))
CrimeRatePerArea2017 <- Crime_in_2017 %>%
group_by(Community) %>%
summarize(CrimeRatePerArea2017=(n()/nrow(Crime_in_2017))*100)
CrimeRatePerArea2017 <- rbind(CrimeRatePerArea2017,list("Unassigned -- Jail",0))
CrimeRatePerArea2016 <- Crime_in_2016 %>%
group_by(Community) %>%
summarize(CrimeRatePerArea2016=(n()/nrow(Crime_in_2016))*100)
CrimeRatePerArea2016 <- rbind(CrimeRatePerArea2016,list("Unassigned -- Jail",0))
CrimeRatePerArea2015 <- Crime_in_2015 %>%
group_by(Community) %>%
summarize(CrimeRatePerArea2015=(n()/nrow(Crime_in_2015))*100)
CrimeRatePerArea2015 <- rbind(CrimeRatePerArea2015,list("Unassigned -- Jail",0))
CrimeRatePerArea2014 <- Crime_in_2014 %>%
group_by(Community) %>%
summarize(CrimeRatePerArea2014=(n()/nrow(Crime_in_2014))*100)
CrimeRatePerArea2014 <- rbind(CrimeRatePerArea2014,list("Unassigned -- Jail",0))
crime_evolution <- CrimeRatePerArea2021 %>%
left_join(CrimeRatePerArea2020,by="Community") %>%
left_join(CrimeRatePerArea2019,by="Community") %>%
left_join(CrimeRatePerArea2018,by="Community") %>%
left_join(CrimeRatePerArea2017,by="Community") %>%
left_join(CrimeRatePerArea2016,by="Community") %>%
left_join(CrimeRatePerArea2015,by="Community") %>%
left_join(CrimeRatePerArea2014,by="Community")
First, we create the CCTV_VS_crimes dataset, which is a left join of the CCTV density per area with the crime rate per area. We can then plot the two variables against each other to compare the crime rate per community with the CCTV density percentage per community. The plot suggests a positive trend.
CCTV_VS_crimes <- CCTV_per_area %>%
left_join(CrimeRatePerArea,by="Community")
View(CCTV_VS_crimes)
plot(CCTV_VS_crimes$CrimeRatePerArea,CCTV_VS_crimes$density_perc, main="Crime Rate per Community VS CCTV Density per Community",xlab="CrimeRatePerCommunity",ylab="CCTVDensityPerCommunity")
regression <- lm(CCTV_VS_crimes$density_perc~CCTV_VS_crimes$CrimeRatePerArea)
summary(regression)
#>
#> Call:
#> lm(formula = CCTV_VS_crimes$density_perc ~ CCTV_VS_crimes$CrimeRatePerArea)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -4.015 -1.036 -0.338 0.948 5.583
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.882 0.554 -1.59 0.12
#> CCTV_VS_crimes$CrimeRatePerArea 1.494 0.275 5.43 1.4e-06
#>
#> (Intercept)
#> CCTV_VS_crimes$CrimeRatePerArea ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 1.91 on 54 degrees of freedom
#> Multiple R-squared: 0.353, Adjusted R-squared: 0.341
#> F-statistic: 29.5 on 1 and 54 DF, p-value: 1.36e-06
y<-regression[["coefficients"]][["(Intercept)"]]
x<-regression[["coefficients"]][["CCTV_VS_crimes$CrimeRatePerArea"]]
range <- seq(from=0, to=4.5, by=0.1)
estimation <- x*range+y
lines(range,estimation, col="blue")
To confirm this trend, we perform a regression with the lm function and call summary on the result. Next, y and x are extracted from the fitted model: y is the intercept and x is the slope coefficient. We then use a small trick: we create a vector called range that runs from 0 to 4.5 (because the plot spans roughly 0 to 4.5 on the horizontal axis) and compute a second vector called estimation. This vector is simply the linear function evaluated on the range, that is, the slope multiplied by each value in the range, plus the intercept. The result is the fitted value at each point of the range. After that, we draw the estimation on the plot as a blue line.
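In the summary of the regression we see that R^2 (a measure of the goodness of fit) is fairly low at 0.353. Still, the slope is positive and highly significant (p = 1.4e-06), so there is a tendency: communities with a higher crime rate tend to have a higher CCTV density.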
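The manual construction of the fitted line can be cross-checked against predict(). The sketch below uses a toy data frame whose numbers are made up and only stand in for CCTV_VS_crimes; the column names are reused from the report, everything else is an assumption.

```r
# Toy data; the numbers are made up and only stand in for CCTV_VS_crimes.
toy <- data.frame(
  CrimeRatePerArea = c(0.3, 0.8, 1.2, 1.6, 2.0, 2.7, 3.4, 4.0),
  density_perc     = c(0.1, 0.5, 1.9, 1.0, 2.8, 2.5, 4.9, 4.6)
)

# Formula interface with a data argument: coefficients get short names.
fit <- lm(density_perc ~ CrimeRatePerArea, data = toy)

# Manual fitted line, as described in the text: intercept + slope * grid value.
grid      <- seq(from = 0, to = 4.5, by = 0.1)
intercept <- coef(fit)[["(Intercept)"]]
slope     <- coef(fit)[["CrimeRatePerArea"]]
manual    <- intercept + slope * grid

# predict() evaluates the same linear function on new data.
predicted <- unname(predict(fit, newdata = data.frame(CrimeRatePerArea = grid)))

# Both approaches should agree up to floating-point error.
print(all.equal(manual, predicted))
```

On an existing scatter plot, either lines(grid, manual, col = "blue") or abline(fit, col = "blue") would draw this line.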
In this section we map the CCTVs and the crimes. The method is the same as before, using the tmap package. However, this time we use two different shapes, tm_shape(Baltimore) and tm_shape(balt_dat), which are added together (as in ggplot) and drawn on top of each other. The resulting map gives a first intuition about the data. Where crime rates are lowest, there seem to be fewer CCTVs (for instance in the north of the city, or even in the west). There also seems to be a correlation between the dark red areas and the CCTV density per area.
#> Warning in sp::proj4string(obj): CRS object has comment, which is
#> lost in output
Here we sort the crime rate per area to find out what its range is. The map classes can then be defined either with manually chosen breaks or with the automatic quantile style.
#> [1] 0.000 0.270 0.362 0.468 0.478 0.512 0.535 0.781 0.981 1.016
#> [11] 1.047 1.067 1.092 1.159 1.161 1.203 1.232 1.298 1.302 1.324
#> [21] 1.349 1.386 1.412 1.442 1.451 1.569 1.577 1.602 1.646 1.652
#> [31] 1.668 1.911 1.926 1.937 1.959 2.013 2.018 2.180 2.203 2.355
#> [41] 2.374 2.389 2.440 2.477 2.685 2.701 2.725 2.730 2.804 2.880
#> [51] 3.057 3.278 3.418 3.647 3.807 4.044
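To illustrate the two options, the sketch below derives quantile breaks and manual breaks from the sorted rates printed above, using base R only. The number of classes and the manual break values are assumptions chosen for illustration; the report does not state which ones were used.

```r
# Sorted crime rates per area, copied from the output above.
crime_rate <- c(
  0.000, 0.270, 0.362, 0.468, 0.478, 0.512, 0.535, 0.781, 0.981, 1.016,
  1.047, 1.067, 1.092, 1.159, 1.161, 1.203, 1.232, 1.298, 1.302, 1.324,
  1.349, 1.386, 1.412, 1.442, 1.451, 1.569, 1.577, 1.602, 1.646, 1.652,
  1.668, 1.911, 1.926, 1.937, 1.959, 2.013, 2.018, 2.180, 2.203, 2.355,
  2.374, 2.389, 2.440, 2.477, 2.685, 2.701, 2.725, 2.730, 2.804, 2.880,
  3.057, 3.278, 3.418, 3.647, 3.807, 4.044
)

# Quantile style: classes holding roughly equal numbers of communities.
n_classes <- 5  # assumption: the number of classes is not given in the text
q_breaks  <- quantile(crime_rate, probs = seq(0, 1, length.out = n_classes + 1))
q_class   <- cut(crime_rate, breaks = q_breaks, include.lowest = TRUE)
print(table(q_class))

# Manual style: fixed, round breaks that cover the observed range (assumed values).
m_breaks <- c(0, 1, 2, 3, 4, 5)
m_class  <- cut(crime_rate, breaks = m_breaks, include.lowest = TRUE)
print(table(m_class))
```

Quantile breaks balance the number of communities per class, while manual breaks keep the class limits easy to read in a legend.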
We want to see whether the presence of CCTV can deter crime. We first try to answer the question of where crime took place, focusing on August 2021. We choose August 2021 because it is the latest full month in our dataset. Taking the latest time points from the data makes it more likely that most of the CCTVs in the dataset were already installed (we have no information on when exactly each CCTV was added). As before, we create a data table, assign coordinates, and define the CRS (in this case the CRS is “EPSG:4326”, which we needed to transform). We then map the result with tm_shape. The output shows where crime takes place relative to the CCTV locations.
Next, we calculate the crime rate per area for August to see where crime was highest. The highest rate is found in the Midtown area, so we take a closer look at Midtown in what follows.
#> [1] 100
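A compact way to reproduce this kind of monthly calculation is sketched below in base R. The toy rows and the column name `CrimeDate` are assumptions made for illustration. Reading the value 100 printed above as a check that the shares sum to 100 is also an assumption.

```r
# Toy incidents; dates and rows are made up for illustration.
# CrimeDate is an assumed column name.
incidents <- data.frame(
  Community = c("Midtown", "Midtown", "Fells Point",
                "Midtown", "Cherry Hill", "Fells Point"),
  CrimeDate = as.Date(c("2021-08-03", "2021-08-15", "2021-08-20",
                        "2021-07-30", "2021-08-31", "2021-09-01"))
)

# Keep only incidents dated in August 2021.
in_august <- format(incidents$CrimeDate, "%Y-%m") == "2021-08"
august    <- incidents[in_august, ]

# Share of the month's incidents per community, in percent.
august_rate <- prop.table(table(august$Community)) * 100
print(august_rate)

# Sanity check: the shares should add up to 100.
print(sum(august_rate))

# Community with the highest share in the toy data.
top_community <- names(which.max(august_rate))
print(top_community)
```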
We continue the analysis with the st_bbox function, which takes four values: the most extreme values of the area on the x-axis and on the y-axis. These four values are coordinates in the coordinate system we have used so far (Pseudo-Mercator). The function turns that rectangle into a shape in its own right. When we create the Midtown map, we can therefore use it as an argument in tm_shape. The output is a zoomed-in map of Midtown together with the larger map, on which a rectangle marks the area we are looking at and analysing.
#> Warning in sp::proj4string(obj): CRS object has comment, which is
#> lost in output
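As a rough sketch of the bounding-box step described above, assuming the sf package: the four coordinate values below are placeholders in Pseudo-Mercator metres, not the values used in the project, and `midtown_bbox` and `midtown_rect` are hypothetical names.

```r
library(sf)

# Hypothetical extreme x/y values in Pseudo-Mercator metres (placeholders)
midtown_bbox <- st_bbox(
  c(xmin = -8530000, ymin = 4762000, xmax = -8527000, ymax = 4765000),
  crs = st_crs(3857)
)

# Turn the bounding box into a polygon geometry that can be drawn as a rectangle
midtown_rect <- st_as_sfc(midtown_bbox)

# The result is a single polygon carrying the same CRS as the bounding box
st_geometry_type(midtown_rect)
```

In tmap, such a box can typically be passed to the `bbox` argument of tm_shape to zoom in, while the polygon can be drawn on the overview map with tm_shape followed by tm_borders.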
Similar to what we did before, we first define the area of the prison. The prison and its surrounding area are interesting to analyse, since there are many CCTVs around it but basically no crime, which makes it an outlier.
#> Warning in sp::proj4string(obj): CRS object has comment, which is
#> lost in output
This is exactly the same as what we did in 4.1. Here we want to see whether the correlation with CCTV density differs between felonies and misdemeanors. Both regressions show a weak R² (0.042 for felonies and 0.073 for misdemeanors), and the relationship looks even weaker graphically. Only the misdemeanor coefficient is significant at the 5% level (p = 0.045). We therefore do not see a markedly bigger impact of CCTV installation on one crime type than on the other.
#>
#> Call:
#> lm(formula = CCTV_VS_crimes$density_perc ~ FelonyStats$FelonyRatePerArea)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2.709 -1.525 -0.889 1.306 7.633
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 0.857 0.676 1.27 0.21
#> FelonyStats$FelonyRatePerArea 0.520 0.336 1.55 0.13
#>
#> Residual standard error: 2.32 on 54 degrees of freedom
#> Multiple R-squared: 0.0424, Adjusted R-squared: 0.0246
#> F-statistic: 2.39 on 1 and 54 DF, p-value: 0.128
#>
#> Call:
#> lm(formula = CCTV_VS_crimes$density_perc ~ MisdemeanorStats$MisdemeanorRatePerArea)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -3.214 -1.437 -0.943 0.957 6.333
#>
#> Coefficients:
#> Estimate Std. Error t value
#> (Intercept) 0.694 0.613 1.13
#> MisdemeanorStats$MisdemeanorRatePerArea 0.612 0.297 2.06
#> Pr(>|t|)
#> (Intercept) 0.263
#> MisdemeanorStats$MisdemeanorRatePerArea 0.045 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 2.29 on 54 degrees of freedom
#> Multiple R-squared: 0.0726, Adjusted R-squared: 0.0554
#> F-statistic: 4.23 on 1 and 54 DF, p-value: 0.0446
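The quantities discussed for the regressions above (slope, p-value, R² and adjusted R²) can be extracted programmatically instead of being read off the printed summary. The sketch below uses simulated data for 56 areas; the data frame `areas` and its column names are hypothetical stand-ins, and the numbers it produces are not the project's results.

```r
set.seed(1)

# Synthetic stand-in for the 56 areas; values are simulated, not the project's data
areas <- data.frame(misdemeanor_rate = runif(56, min = 0, max = 4))
areas$cctv_density <- 0.7 + 0.6 * areas$misdemeanor_rate + rnorm(56, sd = 2.3)

# Fit the same kind of simple linear regression, using the data argument
fit <- lm(cctv_density ~ misdemeanor_rate, data = areas)
fit_summary <- summary(fit)

# Pull out the quantities discussed in the text
slope   <- coef(fit_summary)["misdemeanor_rate", "Estimate"]
p_value <- coef(fit_summary)["misdemeanor_rate", "Pr(>|t|)"]
r2      <- fit_summary$r.squared
adj_r2  <- fit_summary$adj.r.squared

c(slope = slope, p_value = p_value, r2 = r2, adj_r2 = adj_r2)
```

Keeping both variables in one data frame and passing it through `data =` also guarantees that each row refers to the same area, which is not automatic when the two sides of the formula come from different objects. In a simple regression like this one, R² equals the squared Pearson correlation between the two variables.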
We wanted to see whether there is a correlation between CCTV density and wealth, so we similarly perform a regression. The results are not very conclusive, since both the R² (0.072) and the adjusted R² (0.055) are poor. Visually, however, we can see interesting patterns. On the map, the areas with no CCTVs are more likely to be quite wealthy. However, we are not sure that wealth is the only dependency here; we think CCTV density is rather correlated with the crime rate in these areas. Again, in the northern parts we see fewer CCTVs, less crime and a wealthier population.
#>
#> Call:
#> lm(formula = poverty_data$hhpov19 ~ CCTV_per_area$density_perc)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -23.311 -8.343 -0.405 7.024 23.902
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 14.11 1.85 7.63 4e-10 ***
#> CCTV_per_area$density_perc 1.29 0.63 2.05 0.046 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 11 on 54 degrees of freedom
#> Multiple R-squared: 0.0719, Adjusted R-squared: 0.0547
#> F-statistic: 4.18 on 1 and 54 DF, p-value: 0.0457
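We see a tendency: there seem to be more crimes in poorer areas, although the regression of the poverty rate on the crime rate is only significant at the 10% level (p = 0.06).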
#> [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [14] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [27] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [40] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [53] TRUE TRUE TRUE TRUE
The idea is to find areas that were impacted by a certain type of crime. We see that felonies and misdemeanors were distributed roughly equally. There seems to be a strong tendency in downtown, which is at the same time one of the richest areas in Baltimore. We do not have enough information to draw conclusions here, but it could be that there are fewer felonies in this area.
#>
#> Call:
#> lm(formula = poverty_data$hhpov19 ~ CrimeRatePerArea$CrimeRatePerArea)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -19.554 -9.239 -0.727 8.722 25.242
#>
#> Coefficients:
#> Estimate Std. Error t value
#> (Intercept) 10.97 3.20 3.43
#> CrimeRatePerArea$CrimeRatePerArea 3.05 1.59 1.92
#> Pr(>|t|)
#> (Intercept) 0.0012 **
#> CrimeRatePerArea$CrimeRatePerArea 0.0605 .
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 11 on 54 degrees of freedom
#> Multiple R-squared: 0.0637, Adjusted R-squared: 0.0464
#> F-statistic: 3.68 on 1 and 54 DF, p-value: 0.0605
#>
#> Call:
#> lm(formula = FelonyStats$FelonyRatePerArea ~ MisdemeanorStats$MisdemeanorRatePerArea)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -1.6277 -0.3597 -0.0312 0.3219 1.6228
#>
#> Coefficients:
#> Estimate Std. Error t value
#> (Intercept) 0.5163 0.1540 3.35
#> MisdemeanorStats$MisdemeanorRatePerArea 0.7109 0.0747 9.51
#> Pr(>|t|)
#> (Intercept) 0.0015 **
#> MisdemeanorStats$MisdemeanorRatePerArea 3.9e-13 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.575 on 54 degrees of freedom
#> Multiple R-squared: 0.626, Adjusted R-squared: 0.619
#> F-statistic: 90.5 on 1 and 54 DF, p-value: 3.88e-13
### 4.6
* Answers to the research questions
* Different methods considered
* Competing approaches
* Justifications